System and method for least mean fourth adaptive filtering

ABSTRACT

The system and method for least mean fourth adaptive filtering is a system that uses a general purpose computer or a digital circuit (such as an ASIC, a field-programmable gate array, or a digital signal processor) that is programmed to utilize a normalized least mean fourth algorithm. The normalization is performed by dividing a weight vector update term by the fourth power of the norm of the regressor.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to digital signal processing techniques and signal filtering, and particularly to a system and method for least mean fourth adaptive filtering that provides an adaptive filter and method of adaptive filtering utilizing a normalized least mean fourth algorithm.

2. Description of the Related Art

The least mean fourth (LMF) algorithm has uses in a wide variety of applications. The algorithm outperforms the well-known least mean square (LMS) algorithm in cases with non-Gaussian noise. Even in Gaussian environments, the LMF algorithm can outperform the LMS algorithm when initialized far from the Wiener solution. The true usefulness of the LMF algorithm lies in its faster initial convergence and lower steady-state error relative to the LMS algorithm. More importantly, its mean-fourth error cost function yields better performance than that of the LMS for noise of sub-Gaussian nature, or noise with a light-tailed probability density function. However, this higher-order algorithm requires a much smaller step-size to ensure stable adaptation, since the cubed error in the LMF gradient vector can cause devastating initial instability, resulting in unnecessary performance degradation.

One approach to solving the above degradation problem is to normalize the weight update term of the algorithm. In conventional techniques, the LMF algorithm is normalized by dividing the weight vector update term by the squared norm of the regressor. In some prior art techniques, this normalization is performed by dividing the weight vector update term by a weighted sum of the squared norm of the regressor and the squared norm of the estimation error vector. Thus, the LMF algorithm is normalized by both the signal power and the error power. Combining the signal power and the error power has the advantage that the former normalizes the input signal, while the latter can dampen down the outlier estimation errors, thus improving stability while still retaining fast convergence.

The above has been modified by using an adaptive, rather than fixed, mixing parameter in the weighted sum of the squared norm of the regressor and the squared norm of the estimation error vector. The adaptation of the mixing parameter improves the tracking properties of the algorithm. However, unlike the normalization of the LMS algorithm, the above normalization techniques of the LMF algorithm do not protect the algorithm from divergence when the input power of the adaptive filter increases. In fact, as will be shown below, the above prior art normalized LMF (NLMF) algorithms diverge when the input power of the adaptive filter exceeds a threshold that depends on the step-size of the algorithm. The reason for this drawback is that in all of the above techniques, the weight vector update term, which is a fourth order polynomial of the regressor, is normalized by a second order polynomial of the regressor.

The prior art normalization techniques do not ensure a normalization of the input signal. Thus, the algorithm stability remains dependent on the input power of the adaptive filter.

For the particular application of adaptive plant identification, as diagrammatically illustrated in FIG. 1, the output a_(k) of the unknown plant 100 is given by:

$\begin{matrix}{a_{k} = g^{T}x_{k} + b_{k},} & (1)\end{matrix}$

where

$\begin{matrix}{g \equiv \left( g_{1},g_{2},\ldots,g_{N} \right)^{T}} & (2)\end{matrix}$

is a vector composed of the plant parameters g_(i) (from an unknown finite impulse response (FIR) filter 102), and

$\begin{matrix}{x_{k} = \left( x_{k},x_{k - 1},x_{k - 2},\ldots,x_{k - N + 1} \right)^{T}} & (3)\end{matrix}$

is the regressor vector at time k. N is the number of plant parameters, x_(k) is the plant input, b_(k) is the plant noise, and the notation (.)^(T) represents the transpose of (.). The identification of the plant is made by an adaptive finite impulse response (FIR) filter 104 whose length is assumed equal to that of the plant. The weight vector h_(k) of the adaptive filter is adapted on the basis of the error e_(k), which is given by

$\begin{matrix}{e_{k} = a_{k} - h_{k}^{T}x_{k},} & (4)\end{matrix}$

where h_(k)=(h_(1,k), h_(2,k), . . . , h_(N,k))^(T). The adaptation algorithm of interest is the LMF algorithm, which is described by

$\begin{matrix}{h_{k + 1} = h_{k} + \mu e_{k}^{3}x_{k},} & (5)\end{matrix}$

where μ>0 is the algorithm step-size. The error signal e_(k) can be decomposed into two terms as follows:

$\begin{matrix}{e_{k} \equiv b_{k} + \varepsilon_{k}.} & (6)\end{matrix}$

The first term on the right hand side of equation (6), b_(k), is the plant noise. The second term, ε_(k), is the excess estimation error. The weight deviation vector is defined by

$\begin{matrix}{v_{k} \equiv h_{k} - g.} & (7)\end{matrix}$

From equations (1), (4), (6) and (7),

$\begin{matrix}{\varepsilon_{k} = - v_{k}^{T}x_{k}.} & (8)\end{matrix}$

Inserting equations (1), (4) and (7) into equation (5) yields

$\begin{matrix}{v_{k + 1} = v_{k} + \mu\left( b_{k} - v_{k}^{T}x_{k} \right)^{3}x_{k}.} & (9)\end{matrix}$

In the above, the following assumptions are used: The first assumption (assumption A1) is that the sequences {x_(k)} and {b_(k)} are mutually independent. The second assumption (assumption A2) is that {x_(k)} is a stationary sequence of zero mean random variables with a finite variance σ_(x)². The third assumption (assumption A3) is that {b_(k)} is a stationary sequence of independent zero mean random variables with a finite variance σ_(b)². Such assumptions are typical in the context of adaptive filtering.
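As an illustration only, one iteration of the LMF recursion of equations (4) and (5) can be sketched in Python as follows. The two-tap plant `g`, the regressor `x`, the step-size value, and the helper names `dot` and `lmf_step` are hypothetical choices made for this sketch; the sample is taken as noise-free (b_(k)=0).

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def lmf_step(h, x, a, mu):
    """One LMF iteration: e_k = a_k - h_k^T x_k (eq. 4), h_{k+1} = h_k + mu*e_k^3*x_k (eq. 5)."""
    e = a - dot(h, x)
    h_next = [hi + mu * e**3 * xi for hi, xi in zip(h, x)]
    return h_next, e


# Hypothetical two-tap plant and a single noise-free sample (b_k = 0).
g = [0.6, 0.8]
x = [1.0, -1.0]
mu = 0.1
a = dot(g, x)                      # equation (1) with b_k = 0
h0 = [0.0, 0.0]
h1, e = lmf_step(h0, x, a, mu)

# The same step expressed through the weight deviation v_k = h_k - g, equation (9).
v0 = [hi - gi for hi, gi in zip(h0, g)]
v1 = [vi + mu * (0.0 - dot(v0, x))**3 * xi for vi, xi in zip(v0, x)]
```

In this sketch `h1` minus `g` agrees with `v1`, which is the equivalence between equations (5) and (9).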

In order to emphasize the need for normalization in the least mean fourth algorithm, examining the normalization of the LMS algorithm is important. The stability of the LMS algorithm is dependent upon the input power of the adaptive filter. This makes it very hard, if not impossible, to choose a step-size that guarantees stability of the algorithm when there is lack of knowledge about the input power. This is solved by normalizing the weight update term by ∥x_(k)∥², where ∥x_(k)∥ is the Euclidean norm of the vector x_(k), which is defined as ∥x_(k)∥=√(x_(k)^(T)x_(k)). The resulting algorithm is referred to as the normalized LMS (NLMS) algorithm. This algorithm is stable for all values of the filter input power so long as the step-size is between 0 and 2.
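For comparison, a minimal Python sketch of an NLMS step is given below. The function name `nlms_step`, the optional regularization constant `eps`, and the numerical values are assumptions made for illustration and are not taken from the specification.

```python
def nlms_step(h, x, a, mu, eps=0.0):
    """One NLMS iteration: h_{k+1} = h_k + mu * e_k * x_k / (||x_k||^2 + eps)."""
    e = a - sum(hi * xi for hi, xi in zip(h, x))
    norm_sq = sum(xi * xi for xi in x)
    scale = mu * e / (norm_sq + eps)
    return [hi + scale * xi for hi, xi in zip(h, x)], e


# Hypothetical sample: regressor, desired output, and zero initial weights.
x = [3.0, 4.0]
a = 10.0
h_new, e_prior = nlms_step([0.0, 0.0], x, a, mu=1.0)

# With mu = 1 and eps = 0 the error on the same sample is driven to zero,
# whatever the scale of x; this is the effect of dividing by ||x_k||^2.
e_post = a - sum(hi * xi for hi, xi in zip(h_new, x))
```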

It is desirable to develop a version of the LMF algorithm that has a similar feature as that of the NLMS algorithm; i.e., stability for all values of the filter input power for an appropriate fixed range of the step-size. One prior art normalization technique is given as

$\begin{matrix}{v_{k + 1} = v_{k} + \mu\left( b_{k} - v_{k}^{T}x_{k} \right)^{3}\frac{x_{k}}{\left\| x_{k} \right\|^{2}},} & (10)\end{matrix}$

and a second version is given by

$\begin{matrix}{v_{k + 1} = v_{k} + \mu\left( b_{k} - v_{k}^{T}x_{k} \right)^{3}\frac{x_{k}}{\delta + \lambda\left\| x_{k} \right\|^{2} + \left( 1 - \lambda \right)\left\| e_{k} \right\|^{2}},} & (11)\end{matrix}$

where δ is a small positive number, 0<δ<1, and e_(k)=(e_(k), e_(k−1), e_(k−2), . . . , e_(k−N+1))^(T) is the error vector. The parameter λ is referred to as the mixing power parameter. The choice of λ is a compromise between fast convergence and low steady-state error. However, the stability of the above algorithms depends on the mean square input of the adaptive filter. To show this undesired feature for the algorithm of equation (10), we may consider the scalar case, N=1, with zero noise, b_(k)=0, and binary input, x_(k)∈{−1,1}, with μ=0.5. In this case, equation (10) implies that:

$\begin{matrix}{v_{k + 1} = v_{k} - 0.5v_{k}^{3}.} & (12)\end{matrix}$

If v₁=1, then equation (12) implies that v₂=0.5, v₃=0.4375, v₄=0.3956, v₅=0.3647, etc. Thus, v_(k) is decaying in this case. Repeating this example with x_(k)∈{−4,4}, while keeping all other conditions unchanged, equation (10) implies that:

$\begin{matrix}{v_{k + 1} = v_{k} - 8v_{k}^{3}.} & (13)\end{matrix}$

Again, if v₁=1, then equation (13) yields v₂=−7, v₃=2737, v₄=−1.6(10¹¹), v₅=3.5(10³⁴), etc. Thus, the algorithm of equation (10) diverges in this case. This shows that the stability of the normalized LMF algorithm of equation (10) depends on the input power of the adaptive filter.
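The two scalar recursions can be reproduced with a short Python sketch. The function name `iterate_eq10_scalar` and its parameters are hypothetical; the code assumes the scalar, noise-free form of equation (10), in which the update reduces to v_(k+1)=v_(k)−μx²v_(k)³ because only the magnitude of the binary input matters.

```python
def iterate_eq10_scalar(x_abs, mu=0.5, v1=1.0, steps=4):
    """Scalar, noise-free form of equation (10): v_{k+1} = v_k - mu * x^2 * v_k^3."""
    v = [v1]
    for _ in range(steps):
        v.append(v[-1] - mu * x_abs**2 * v[-1]**3)
    return v


low_power = iterate_eq10_scalar(1.0)    # equation (12): x_k in {-1, 1}
high_power = iterate_eq10_scalar(4.0)   # equation (13): x_k in {-4, 4}

print([round(v, 4) for v in low_power])
print(["%.3g" % v for v in high_power])
```

The first list decays in magnitude, whereas the second alternates in sign and grows without bound, under the same step-size.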

It can also be shown that the stability of the algorithm of equation (11) depends on the input power. We may consider again the scalar case with μ=0.5, δ=0, λ=0.5, x_(k)∈{−1,1}, and b_(k)=0. In this case, equations (6) and (8) imply that e_(k)²=v_(k)²x_(k)² and equation (11) implies that:

$\begin{matrix}{v_{k + 1} = v_{k} - \frac{v_{k}^{3}}{1 + v_{k}^{2}}.} & (14)\end{matrix}$

If v₁=1, then equation (14) produces v₂=0.5, v₃=0.4, v₄=0.3448, v₅=0.3082, etc. Thus, v_(k) is decaying in this case. Repeating this example with x_(k)∈{−4,4}, while keeping all other conditions unchanged, equation (11) implies that:

$\begin{matrix}{v_{k + 1} = v_{k} - \frac{16v_{k}^{3}}{1 + v_{k}^{2}}.} & (15)\end{matrix}$

Again, if v₁=1, then equation (15) produces v₂=−7, v₃=102.76, v₄=−1541.2, v₅=23119, etc. Thus, the algorithm of equation (11) diverges in this case. This shows that the stability of the normalized LMF algorithm of equation (11) depends on the input power of the adaptive filter.
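A corresponding Python sketch for the scalar, noise-free form of equation (11) follows. The function name `iterate_eq11_scalar` and its argument names are hypothetical; the code evaluates the update of equation (11) directly with e_(k)=−v_(k)x_(k), so that equations (14) and (15) are not hard-coded but result from the chosen parameters.

```python
def iterate_eq11_scalar(x_abs, mu=0.5, lam=0.5, delta=0.0, v1=1.0, steps=4):
    """Scalar, noise-free form of equation (11) with a binary input of magnitude x_abs."""
    v = [v1]
    for _ in range(steps):
        vk = v[-1]
        e = -vk * x_abs                       # e_k = -v_k x_k; the sign of x_k cancels below
        denom = delta + lam * x_abs**2 + (1.0 - lam) * e**2
        v.append(vk + mu * e**3 * x_abs / denom)
    return v


low_power = iterate_eq11_scalar(1.0)    # reduces to equation (14)
high_power = iterate_eq11_scalar(4.0)   # reduces to equation (15)

print([round(v, 4) for v in low_power])
print([round(v, 2) for v in high_power])
```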

The above results regarding the dependence of the stability of the prior art NLMF algorithms on the input power of the adaptive filter suggest the need for an NLMF algorithm whose stability does not depend on the input power. Thus, a system and method for least mean fourth adaptive filtering solving the aforementioned problems is desired.

SUMMARY OF THE INVENTION

The system and method for least mean fourth adaptive filtering is a system that uses a general purpose computer or a digital circuit (such as an ASIC, a field-programmable gate array, or a digital signal processor) that is programmed to utilize a normalized least mean fourth algorithm. The normalization is performed by dividing a weight vector update term by the fourth power of the norm of the regressor. The normalized least mean fourth algorithm remains stable as the filter input power increases.

The least mean fourth adaptive filter includes a finite impulse response filter having a desired output a_(k) at a time k. An impulse response of the finite impulse response filter is defined by a set of weighting filter coefficients h_(k), and an input signal of the finite impulse response filter is defined by a regressor input signal vector, x_(k), at the time k. An error signal e_(k) at the time k is calculated as a difference between the desired output a_(k) at the time k and an estimated signal given by h_(k)^(T)x_(k), such that e_(k)=a_(k)−h_(k)^(T)x_(k). The set of weighting filter coefficients are iteratively updated as

${h_{k + 1} = h_{k} + \alpha\frac{e_{k}^{3}x_{k}}{\left\| x_{k} \right\|^{4}}},$

where α is a fixed positive number step-size.

These and other features of the present invention will become readily apparent upon further review of the following specification and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of a plant to be identified by digital signal processing techniques.

FIG. 2 is a graph illustrating simulation results of a system and method for least mean fourth adaptive filtering according to the present invention, validating step-size stability bounds for Gaussian input and noise.

FIG. 3 is a graph illustrating simulation results of the system and method for least mean fourth adaptive filtering according to the present invention, validating step-size stability bounds for uniformly distributed input and noise.

FIG. 4 is a graph illustrating simulation results of the system and method for least mean fourth adaptive filtering according to the present invention, validating step-size stability bounds for correlated input.

FIG. 5 is a graph comparing initial convergence of the present method for normalized least mean fourth (NLMF) adaptive filtering against that of conventional least mean fourth (LMF) adaptive filtering, least mean square (LMS) adaptive filtering and normalized least mean square (NLMS) adaptive filtering.

FIG. 6 is a block diagram illustrating a telecommunication system that includes echo cancellation and employs the system and method for least mean fourth adaptive filtering according to the present invention.

FIG. 7 is a block diagram illustrating an echo canceller used in the system of FIG. 6.

Similar reference characters denote corresponding features consistently throughout the attached drawings.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The system and method for least mean fourth adaptive filtering is a system that uses a general purpose computer or a digital circuit (such as an ASIC, a field-programmable gate array, or a digital signal processor) that is programmed to utilize a normalized least mean fourth algorithm. The normalization is performed by dividing a weight vector update term by the fourth power of the norm of the regressor. The normalized least mean fourth algorithm remains stable as the filter input power increases.

In order to examine the present least mean fourth adaptive filter, it is useful to first consider the scalar case, in which N=1 (where N is the number of plant parameters) with zero noise. In such a case, equation (9) simplifies to:

$\begin{matrix}{v_{k + 1} = \left( 1 - \mu x_{k}^{4}v_{k}^{2} \right)v_{k}.} & (16)\end{matrix}$

The stability of the algorithm represented by equation (16) depends on the statistics of x_(k) (i.e., the plant input). This dependence is removed by using a variable step-size μ that satisfies the following:

$\begin{matrix}{\mu = \frac{\alpha}{x_{k}^{4}},} & (17)\end{matrix}$

assuming that x_(k)≠0 for all k and that α is a fixed positive number. In such a case, equation (16) becomes

$\begin{matrix}{v_{k + 1} = \left( 1 - \alpha v_{k}^{2} \right)v_{k}.} & (18)\end{matrix}$

Since α does not depend on x_(k), the sequence {v_(k)} generated by equation (18) also does not depend on x_(k). Thus, the stability of equation (18) does not depend on the statistics of x_(k). A sufficient condition for the decay of |v_(k)| for all k is that:

$\begin{matrix}{0 < \alpha < \frac{2}{v_{k}^{2}},\quad\text{for all } k.} & (19)\end{matrix}$

When |v_(k)| is decreasing, |v₁| will be the largest value of |v_(k)|. Then, a sufficient condition for equation (19) is:

$\begin{matrix}{0 < \alpha < \frac{2}{v_{1}^{2}}.} & (20)\end{matrix}$

Equation (20) is the step-size range that is sufficient for the convergence of the algorithm represented by equation (16).
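The scalar recursion of equation (18) can be explored with the following Python sketch. The function name `iterate_eq18` and the particular values of α are hypothetical choices; with v₁=1 the bound of equation (20) is α<2, so one value inside the range and one value outside it are tried.

```python
def iterate_eq18(alpha, v1=1.0, steps=20):
    """Equation (18): v_{k+1} = (1 - alpha * v_k^2) * v_k, independent of the input x_k."""
    v = [v1]
    for _ in range(steps):
        v.append((1.0 - alpha * v[-1]**2) * v[-1])
    return v


inside = iterate_eq18(alpha=1.5)             # satisfies 0 < alpha < 2 / v_1^2 = 2
outside = iterate_eq18(alpha=2.5, steps=3)   # violates the bound for v_1 = 1

# |v_k| shrinks at every step when alpha is inside the range of equation (20).
decaying = all(abs(b) < abs(a) for a, b in zip(inside, inside[1:]))
# For this particular out-of-range example |v_k| grows at every step.
growing = all(abs(b) > abs(a) for a, b in zip(outside, outside[1:]))
```

Because equation (20) is a sufficient condition only, the second case is an example of what can happen outside the range, not a statement that every such α diverges.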

The above can now be extended to the N-dimensional LMF algorithm described by equation (9). A factor that determines the convergence of the algorithm is its behavior at the start of adaptation. At the start of adaptation, the magnitude of the deviation vector v_(k) is usually so large that the excess estimation error v_(k)^(T)x_(k) dominates the plant noise b_(k). In such a case, equation (9) can be approximated as:

$\begin{matrix}{v_{k + 1} \approx v_{k} - \mu\left( v_{k}^{T}x_{k} \right)^{3}x_{k} = T_{k}v_{k},} & (21)\end{matrix}$

where the transformation matrix T_(k) is given by:

$\begin{matrix}{T_{k} = I - \mu x_{k}x_{k}^{T}v_{k}v_{k}^{T}x_{k}x_{k}^{T}.} & (22)\end{matrix}$

The transformation matrix of equation (22) depends on ∥x_(k)∥. This dependence can be removed by using a variable step-size μ that satisfies:

$\begin{matrix}{\mu = \frac{\alpha}{\left\| x_{k} \right\|^{4}},} & (23)\end{matrix}$

assuming that x_(k)≠0 for all k and that α is a fixed positive number. In such a case, equation (22) becomes:

$\begin{matrix}{T_{k} = I - \alpha\frac{x_{k}x_{k}^{T}v_{k}v_{k}^{T}x_{k}x_{k}^{T}}{\left\| x_{k} \right\|^{4}}.} & (24)\end{matrix}$

The matrix T_(k) given by equation (24) depends on the direction of the vector x_(k), but it does not depend on its norm, ∥x_(k)∥. Inserting equation (23) into equation (5) yields:

$\begin{matrix}{h_{k + 1} = h_{k} + \alpha\frac{e_{k}^{3}x_{k}}{\left\| x_{k} \right\|^{4}}.} & (25)\end{matrix}$

Equation (25) represents the present NLMF algorithm. Due to equation (4), the weight vector update term on the right-hand side of equation (25) is proportional to the negative gradient of (e_(k)/∥x_(k)∥)⁴ with respect to h_(k). Thus, the algorithm represented by equation (25) may be viewed as a gradient algorithm based on the minimization of the mean fourth normalized estimation error E(e_(k)/∥x_(k)∥)⁴. Similar to the NLMS algorithm, the NLMF algorithm represented by equation (25) may be regularized by adding a small positive number to ∥x_(k)∥⁴ in order to avoid division by zero.
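A minimal Python sketch of the update of equation (25), including the optional regularization mentioned above, might read as follows. The function name `nlmf_step`, the parameter name `eps`, the two-tap plant, and the numerical values are assumptions made for illustration. The sketch also checks, for a noise-free sample, that scaling the regressor by a factor of 10 leaves the updated weights unchanged.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def nlmf_step(h, x, a, alpha, eps=0.0):
    """Equation (25): h_{k+1} = h_k + alpha * e_k^3 * x_k / (||x_k||^4 + eps)."""
    e = a - dot(h, x)                   # equation (4)
    norm_sq = dot(x, x)                 # ||x_k||^2
    scale = alpha * e**3 / (norm_sq**2 + eps)
    return [hi + scale * xi for hi, xi in zip(h, x)], e


# Hypothetical two-tap plant, zero initial weights, noise-free desired output.
g = [0.6, 0.8]
h0 = [0.0, 0.0]
x = [1.0, 2.0]
x_scaled = [10.0 * xi for xi in x]

h_a, _ = nlmf_step(h0, x, dot(g, x), alpha=0.5)
h_b, _ = nlmf_step(h0, x_scaled, dot(g, x_scaled), alpha=0.5)
```

Here `h_a` and `h_b` coincide up to rounding, since e_(k)³x_(k) and ∥x_(k)∥⁴ both scale with the fourth power of the input amplitude when b_(k)=0.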

In the following, a condition on the step-size α is derived that is sufficient for the convergence of the algorithm of equation (25). First, a step-size range is derived that is sufficient for stability of the algorithm in the initial phase of adaptation. Then, a step-size range that is sufficient for steady-state stability is derived. The final step-size range is obtained from the intersection of these two ranges.

One is typically interested in mean-square stability that is based on the evolution of the mean-square deviation E(∥v_(k)∥²) with time, where E denotes the mathematical expectation operator. In the initial phase, the magnitude of E(v_(k)) is usually large in comparison with the fluctuation of v_(k) around E(v_(k)). Thus, E(∥v_(k)∥²)≈∥E(v_(k))∥² in the initial phase of adaptation. Consequently, the derivation of the stability step-size range in the initial adaptation phase can be simplified by calculating it on the basis of the mean of v_(k), rather than the mean square of v_(k). This is not the case with the steady-state condition, which should be derived on the basis of the mean square of v_(k).

With respect to the initial phase condition, at the start of adaptation, the magnitude of the deviation vector v_(k) is usually so large that the excess estimation error v_(k)^(T)x_(k) dominates the plant noise b_(k). In such a case, equation (25) can be approximated by equation (21), with T_(k) being given by equation (24). Letting E(x|y) denote the conditional expectation of x given y, a condition on α is derived that implies that:

$\begin{matrix}{\left\| E\left( v_{k + 1} \middle| v_{k} \right) \right\| < \left\| v_{k} \right\|.} & (26)\end{matrix}$

This condition means that the magnitude of the weight deviation vector is decreasing each iteration step, in a mean sense. Due to equation (21), E(v_(k+1)|v_(k))=E(T_(k)|v_(k))v_(k). Then, a sufficient condition for equation (26) is that the magnitudes of all the eigenvalues of the matrix E(T_(k)|v_(k)) are less than 1. Due to equation (24), the eigenvalues of E(T_(k)|v_(k)) are equal to 1−αλ_(i)(k), where λ_(i)(k); i=1, 2, . . . , N are the eigenvalues of the matrix P(k), defined by:

$\begin{matrix}{P(k) \equiv E\left( \frac{x_{k}x_{k}^{T}v_{k}v_{k}^{T}x_{k}x_{k}^{T}}{\left\| x_{k} \right\|^{4}} \middle| v_{k} \right).} & (27)\end{matrix}$

Thus, a sufficient condition for equation (26) is that:

$\begin{matrix}{\left| 1 - \alpha\lambda_{i}(k) \right| < 1,\quad i = 1,2,\ldots,N.} & (28)\end{matrix}$

It is assumed that the matrix P(k) is positive definite. This assumption is a sort of persistent excitation assumption. A geometrical validation of this assumption is given below. From equation (27), the trace of the matrix P(k) is less than or equal to ∥v_(k)∥². Thus,

$\begin{matrix}{0 < \lambda_{i}(k) < \left\| v_{k} \right\|^{2},\quad i = 1,2,\ldots,N.} & (29)\end{matrix}$

Equation (29) implies that a sufficient condition for equation (28) is that

$\begin{matrix}{0 < \alpha < \frac{2}{\left\| v_{k} \right\|^{2}}.} & (30)\end{matrix}$

Physically speaking, in order to satisfy equation (30) for all k, it is sufficient to satisfy it at k=1, since the maximum weight deviation takes place at the start of adaptation. This leads to the following condition:

$\begin{matrix}{0 < \alpha < \frac{2}{\left\| v_{1} \right\|^{2}}.} & (31)\end{matrix}$

Equation (31) is the step-size range that is sufficient for stability of the NLMF algorithm of equation (25) in the initial phase of adaptation. This step-size range does not depend on the input power of the adaptive filter. The range given by equation (31) reflects the dependence of the stability of the LMF algorithm on the weight initialization of the adaptive filter.

The following provides a geometrical validation of the above assumption that the matrix P(k) is positive definite. For any vector u in the N-dimensional space, equation (27) implies that:

$\begin{matrix}{u^{T}P(k)u = E\left\lbrack u_{x}^{2}(k)v_{x}^{2}(k) \middle| v_{k} \right\rbrack,} & (32)\end{matrix}$

where

$\begin{matrix}{u_{x}(k) \equiv \frac{u^{T}x_{k}}{\left\| x_{k} \right\|}} & (33)\end{matrix}$

is the projection of the vector u on the vector x_(k), and

$\begin{matrix}{v_{x}(k) \equiv \frac{v_{k}^{T}x_{k}}{\left\| x_{k} \right\|}} & (34)\end{matrix}$

is the projection of the vector v_(k) on the vector x_(k). When x_(k) is persistently exciting, it spans the whole N-dimensional space. Then, with non-zero probability, the projections of given vectors u and v_(k) on x_(k) will be different from zero, which implies that u_(x)²(k)v_(x)²(k) will be positive. This implies that the right-hand side of equation (32) will be positive, which implies that the matrix P(k) is positive definite.

With regard to the steady-state condition, a sufficient condition of the steady-state stability of the mean square deviation of the regular LMF algorithm of equation (5) is given by:

$\begin{matrix}{0 < \mu < \frac{E\left( b_{k}^{2} \right)}{10E\left( b_{k}^{4} \right)E\left( \left\| x_{k} \right\|^{2} \right)}.} & (35)\end{matrix}$

For long adaptive filters, ∥x_(k)∥⁴ can be approximated by E(∥x_(k)∥⁴). Then, equation (23) implies that the condition of equation (35) on μ can be mapped to the following condition on α:

$\begin{matrix}{0 < \alpha < \frac{E\left( b_{k}^{2} \right)E\left( \left\| x_{k} \right\|^{4} \right)}{10E\left( b_{k}^{4} \right)E\left( \left\| x_{k} \right\|^{2} \right)}.} & (36)\end{matrix}$

Equation (36) is the step-size range that is sufficient for stability of the present NLMF algorithm of equation (25) in the steady-state.

With regard to the final step-size condition, equation (31) is sufficient for convergence of the NLMF algorithm of equation (25) in the initial phase of adaptation, while equation (36) is sufficient for stability around the Wiener solution. Both properties can be achieved by using the step-size to satisfy the following condition:

$\begin{matrix}{0 < \alpha < \alpha_{o},} & (37)\end{matrix}$

where

$\begin{matrix}{\alpha_{o} = \min\left\{ \frac{2}{\left\| v_{1} \right\|^{2}},\frac{E\left( b_{k}^{2} \right)E\left( \left\| x_{k} \right\|^{4} \right)}{10E\left( b_{k}^{4} \right)E\left( \left\| x_{k} \right\|^{2} \right)} \right\}.} & (38)\end{matrix}$

For a long adaptive filter, E(∥x_(k)∥⁴)/E(∥x_(k)∥²)≈Nσ_(x)². Then, the second term in the argument of the min {.} function on the right-hand side of equation (38) is increasing in N and the signal-to-noise ratio σ_(x)²/σ_(b)². Thus, for the long adaptive filter, non-small signal-to-noise ratio, and non-small initial weight deviation ∥v₁∥,

$\begin{matrix}{\frac{2}{\left\| v_{1} \right\|^{2}} \leq \frac{E\left( b_{k}^{2} \right)E\left( \left\| x_{k} \right\|^{4} \right)}{10E\left( b_{k}^{4} \right)E\left( \left\| x_{k} \right\|^{2} \right)}.} & (39)\end{matrix}$

For illustration of equation (39), an example with N=32, binary input x_(k)∈{−σ_(x), σ_(x)}, binary noise b_(k)∈{−σ_(b), σ_(b)}, and where ∥v₁∥=1 is considered. In this example, equation (39) holds as long as the signal-to-noise ratio σ_(x)²/σ_(b)² is greater than ⅝. Equations (38) and (39) imply that:

$\begin{matrix}{\alpha_{o} = \frac{2}{\left\| v_{1} \right\|^{2}}.} & (40)\end{matrix}$

Thus, the final step-size condition of equation (37) is identical with the initial phase condition of equation (31). Thus, the condition of equation (31) is sufficient for the stability of the NLMF algorithm of equation (25) in applications with a long adaptive filter and non-small signal-to-noise ratio. This condition is validated by the simulation results given below.
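The ⅝ threshold of the binary example can be checked numerically. The Python sketch below is illustrative only; the function name `alpha_bound_binary` is hypothetical. It relies on the fact that, for a binary input, ∥x_(k)∥²=Nσ_(x)² exactly, so that E(∥x_(k)∥⁴)/E(∥x_(k)∥²)=Nσ_(x)², and that, for binary noise, E(b_(k)²)=σ_(b)² and E(b_(k)⁴)=σ_(b)⁴.

```python
import math


def alpha_bound_binary(N, sigma_x, sigma_b, v1_norm):
    """Evaluate both terms of equation (38) for binary input and binary noise."""
    initial = 2.0 / v1_norm**2                                      # first term, eq. (31)
    steady = (sigma_b**2 * N * sigma_x**2) / (10.0 * sigma_b**4)    # second term, eq. (36)
    return min(initial, steady), initial, steady


# At a signal-to-noise ratio of exactly 5/8 the two terms coincide for N = 32, ||v_1|| = 1.
_, initial_at_threshold, steady_at_threshold = alpha_bound_binary(
    32, math.sqrt(5.0 / 8.0), 1.0, 1.0)

# A higher signal-to-noise ratio makes the initial-phase term the binding one, eq. (40).
alpha_o, initial, steady = alpha_bound_binary(32, 1.0, 0.01, 1.0)
```

In closed form the second term equals 3.2σ_(x)²/σ_(b)² for N=32, which equals 2 when the ratio is ⅝.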

The following simulations were performed for the case of adaptive plant identification, as illustrated in FIG. 1. The plant 100 is made of a time-invariant FIR filter 102 with N=32 parameters. The regressor vector is given by:

$\begin{matrix}{x_{k} = \left( x_{k},x_{k - 1},\ldots,x_{k - N + 1} \right)^{T},} & (41)\end{matrix}$

where x_(k) is the plant input. x_(k) is a zero mean, independent and identically distributed (IID) Gaussian sequence with variance σ_(x)². The plant noise is a zero mean IID Gaussian sequence with variance σ_(b)². The plant parameters are given by:

$\begin{matrix}{g_{i} = \left\{ \begin{matrix}{\rho i,} & {1 \leq i \leq N/2} \\{\rho\left( N + 1 - i \right),} & {N/2 < i \leq N.}\end{matrix} \right.} & (42)\end{matrix}$

The value of ρ is chosen such that ∥g∥=1. In the exemplary case of the simulation, N=32 and ρ=0.0183. The initial weight vector of the adaptive filter is h₁=0, thus ∥v₁∥=1.
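The triangular plant response of equation (42) and the normalizing constant ρ can be generated with a few lines of Python. The function name `plant_parameters` is a hypothetical helper introduced for this sketch.

```python
import math


def plant_parameters(N):
    """Equation (42): g_i = rho*i for i <= N/2 and rho*(N+1-i) otherwise, with ||g|| = 1."""
    shape = [i if i <= N // 2 else N + 1 - i for i in range(1, N + 1)]
    rho = 1.0 / math.sqrt(sum(s * s for s in shape))
    return rho, [rho * s for s in shape]


rho, g = plant_parameters(32)
g_norm = math.sqrt(sum(gi * gi for gi in g))

# With h_1 = 0 the initial weight deviation is v_1 = -g, so ||v_1|| = ||g||.
v1_norm = g_norm
print(round(rho, 4), round(g_norm, 6))
```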

To study the dependence of the stability of the conventional NLMF algorithms of equations (10) and (11) and the present NLMF algorithm of equation (25) on the input power of the filter, the algorithms are simulated over a wide range of σ_(x), ranging from 0.2 to 1000. The considered value of σ_(b) is 0.01. For the algorithm represented by equation (10), μ=1. For the algorithm represented by equation (11), μ=1, δ=0, and λ=0.5. For the algorithm represented by equation (25), α=1. The simulations have shown that the conventional NLMF algorithms of equations (10) and (11) diverge for σ_(x)=1 and above, whereas the present NLMF algorithm of equation (25) is non-divergent for all σ_(x).

In order to validate the step-size condition given by equation (31) of the present NLMF algorithm, the maximum step-size for which the algorithm is convergent is determined by simulations and compared with the step-size bound provided by equation (31). This is performed at several values of σ_(x). The considered noise variance, plant parameter vector, and initial weight vector of the adaptive filter are as given above. The results are shown in FIG. 2. The stability step-size bound obtained by simulation is greater than that provided by the condition of equation (31). This is a validation of equation (31). It should be noted that the stability condition of equation (31) is sufficient, but not necessary. This means that the algorithm is not only convergent for all values of α satisfying equation (31), but it may also converge for some values of α that do not satisfy equation (31). Therefore, all that is needed from the simulations is to show that the range defined by equation (31) is included in the range obtained by simulations, which is the case with the results of FIG. 2. It is not required to show that the step-size range defined by equation (31) coincides with the range obtained by simulations.

To validate the step-size condition of equation (31) for non-Gaussian plant input and plant noise, the simulations of FIG. 2 are re-performed for uniformly distributed input and noise. The considered noise variance, plant parameter vector, and initial weight vector of the adaptive filter are the same as those considered in FIG. 2. The results are shown in FIG. 3. These results validate the sufficiency of the step-size condition given by equation (31) for the stability of the algorithm over a wide range of the input power of the adaptive filter.

To validate the step-size condition given by equation (31) for non-white input of the plant, the simulations of FIG. 2 are re-performed for an autoregressive x_(k) satisfying

$\begin{matrix}{x_{k} = \beta x_{k - 1} + \sigma_{x}\sqrt{1 - \beta^{2}}\,w_{k},\quad 0 \leq \beta < 1,} & (43)\end{matrix}$

where w_(k) is a zero mean, unity variance IID Gaussian sequence. The parameter β controls the degree of correlation of the sequence {x_(k)}; the greater β is, the stronger the correlation. In the simulations, the considered value of β was 0.95, which corresponds to a strong correlation of the sequence {x_(k)}. This implies both a strong correlation among the components of the regressor and a strong correlation between successive regressors. The noise is white Gaussian. The considered noise variance, plant parameter vector, and initial weight vector of the adaptive filter are the same as those considered in FIG. 2. The results are shown in FIG. 4. These results validate the sufficiency of the step-size condition given by equation (31) for the stability of the algorithm over a wide range of the input power of the adaptive filter.
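The autoregressive input of equation (43) could be generated as in the following Python sketch. The function name `ar1_input`, the zero initial state, and the seed are assumptions made for illustration; the specification does not state how the sequence was initialized.

```python
import math
import random


def ar1_input(beta, sigma_x, w, x_prev=0.0):
    """Equation (43): x_k = beta * x_{k-1} + sigma_x * sqrt(1 - beta^2) * w_k."""
    gain = sigma_x * math.sqrt(1.0 - beta**2)
    out = []
    for wk in w:
        x_prev = beta * x_prev + gain * wk
        out.append(x_prev)
    return out


# Strongly correlated input as in the simulations (beta = 0.95); seed chosen arbitrarily.
rng = random.Random(0)
w = [rng.gauss(0.0, 1.0) for _ in range(1000)]
x = ar1_input(0.95, 1.0, w)

# The gain sigma_x*sqrt(1 - beta^2) keeps the stationary variance at sigma_x^2:
# beta^2 * sigma_x^2 + sigma_x^2 * (1 - beta^2) = sigma_x^2.
stationary_variance = 0.95**2 * 1.0 + (1.0 * math.sqrt(1.0 - 0.95**2))**2
```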

Finally, FIG. 5 compares the initial convergence of the present NLMF algorithm with those of the LMF, NLMS, and LMS algorithms for white Gaussian x_(k) and b_(k) with σ_(x)=1 and σ_(b)=0.1. FIG. 5 shows the evolution of the excess mean square error (MSE) of the algorithms with time. The instantaneous excess MSE is defined as E(ε_(k)²), where ε_(k) is defined by equation (8). The step-sizes of the algorithms are chosen such that they have the same steady-state excess MSE. The step-size of the present NLMF algorithm is 1. The resulting steady-state excess MSE is 1×10⁻⁵. The step-sizes of the LMF, NLMS, and LMS algorithms having the same steady-state excess MSE are 0.001, 0.002, and 6.25×10⁻⁵, respectively. The instantaneous excess MSE is evaluated by averaging ε_(k)² over 1,000 independent runs.

FIG. 5 shows that the initial convergence of the present NLMF algorithm is almost the same as that of the LMF algorithm for the same steady-state excess MSE. This means that the present NLMF algorithm retains the fast initial convergence advantage of the LMF algorithm. With regard to the similarity of the transient performances of the LMF and NLMF algorithms, for large N, ∥x_(k)∥²≈Nσ_(x)². This implies that the behavior of the NLMF algorithm of equation (25) will be close to that of the LMF algorithm of equation (5) with μ=α/(N²σ_(x)⁴). This condition is satisfied by the values of μ and α considered in FIG. 5. FIG. 5 also shows that the initial convergence of the NLMS algorithm is almost the same as that of the LMS algorithm and that they are significantly slower than the LMF and NLMF algorithms.
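The correspondence μ=α/(N²σ_(x)⁴) can be evaluated for the values used in FIG. 5 with a short Python sketch; the function name `equivalent_lmf_step` is hypothetical.

```python
def equivalent_lmf_step(alpha, N, sigma_x):
    """LMF step-size matching an NLMF step-size alpha when ||x_k||^2 is close to N*sigma_x^2."""
    return alpha / (N**2 * sigma_x**4)


# Values of FIG. 5: alpha = 1, N = 32, sigma_x = 1.
mu_equiv = equivalent_lmf_step(1.0, 32, 1.0)
print(mu_equiv)   # 1/1024, close to the LMF step-size of 0.001 used in FIG. 5
```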

FIGS. 6 and 7 illustrate a particular application of the present system and method for least mean fourth adaptive filtering. Echoes are generated whenever part of a speech is reflected back to the source by the floor, walls, or other neighboring objects. An echo is noticeable (or audible) only if the time delay between it and the speech exceeds a few tens of milliseconds. Echoes are also generated as a result of impedance mismatches in telephone circuits. The echoes arise in various situations in telecommunications networks and impair communication quality. Long-delay echoes are often irritating to the user, whereas shorter ones, called “sidetones”, are actually desirable and are intentionally inserted in telecommunications networks to make the telephone circuit seem “alive”.

Echoes with long delay are observed only on long-distance connections. To clearly understand the echo phenomenon, FIG. 6 illustrates a typical long distance telephone connection 200. Central to such a connection, a pair of two-wire segments 202, 204 are provided, the ends of which connect a customer C to a central office O, along with a four-wire carrier section 206 (which might include satellite links). The two-wire circuits 202, 204 are bidirectional, whereas the four-wire circuits 206 are made of two distinct channels, one for each direction.

In order to counteract the echo phenomenon, schemes must be developed to either completely eliminate it (i.e., the ideal requirement), or to at least substantially reduce its adverse effect so as to achieve a transmission of good quality. Echo cancellation is a suitable area for the application of adaptive filtering. An adaptive echo canceller 208, 210 estimates the responses of an underlying echo-generating system in real time in the face of unknown and time-varying echo path characteristics, generates a synthesized echo based on the estimate, and cancels the echo by subtracting the synthesized echo from the received signal.

In FIG. 6, echo cancellers 208, 210 are identical, and FIG. 7 illustrates a block diagram of the echo canceller 208. The synthetic echo y_(k)′ is generated by passing the reference signal through an adaptive filter that ideally matches the impulse response of the echo path. Thus, the transversal filter generates an estimate y_(k)′ of the echo, given by:

$\begin{matrix}{y_{k}^{\prime} = \sum\limits_{i = 0}^{N - 1}h_{i,k}x_{k - i},} & (44)\end{matrix}$

where {h_(i,k)} is the estimated echo-path impulse response sample, x_(k−i) is the input sample at the i^(th) tap delay, and N is the number of tap coefficients. While passing through the hybrid 212, the speech from customer C results in the echo signal y_(k). This echo, together with the speech from office O, r_(k), constitutes the desired response for the adaptive filter. The canceller error signal is obtained as follows:

$\begin{matrix}{\zeta_{k} = y_{k} - y_{k}^{\prime} + r_{k} = e_{k} + r_{k}.} & (45)\end{matrix}$

The error signal e_(k) of FIG. 7 is used to control the adjustments to the adaptive filter coefficients according to the present normalized least mean fourth adaptive algorithm in order to continuously improve the echo estimate y_(k)′.
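Equations (44) and (45) can be sketched in Python as follows. The function names, the two-tap coefficient values, and the signal samples are hypothetical; `x_recent[i]` is assumed to hold the input sample x_(k−i).

```python
def echo_estimate(h, x_recent):
    """Equation (44): y'_k = sum over i of h_{i,k} * x_{k-i}, with x_recent[i] = x_{k-i}."""
    return sum(hi * xi for hi, xi in zip(h, x_recent))


def canceller_error(y, r, h, x_recent):
    """Equation (45): zeta_k = y_k - y'_k + r_k."""
    return y - echo_estimate(h, x_recent) + r


# Hypothetical two-tap canceller whose coefficients already match the echo path.
h = [0.5, 0.25]
x_recent = [2.0, 4.0]                 # x_k and x_{k-1}
y = echo_estimate(h, x_recent)        # true echo equals the estimate in this example
r = 0.3                               # speech from office O
zeta = canceller_error(y, r, h, x_recent)
```

With a perfectly matched filter the echo term cancels and `zeta` equals `r`, which is the condition ζ_(k)=r_(k).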

Ideally, the system eventually converges to the condition ζ_(k)=r_(k). The effect of this ideal condition on the echo cancellation is naturally of some concern. Convergence of the echo to zero, however, is not an adequate criterion of performance for a system of this type, since this is possible only if y_(k) is exactly representable as the output of a fixed-tap filter. A better performance criterion is the convergence of the filter's impulse response to the response of the echo path.

It is to be understood that the present invention is not limited to the embodiments described above, but encompasses any and all embodiments within the scope of the following claims.

We claim:
 1. A least mean fourth adaptive filter circuit, comprising a finite impulse response filter circuit having a desired output a_(k) at a time k, an impulse response defined by a set of weighting filter coefficients h_(k), and an input signal defined by a regressor input signal vector at the time k, x_(k), the finite impulse filter circuit further having: means for calculating an error signal e_(k) at the time k as a difference between the desired output a_(k) at the time k and an estimated signal given by h_(k)^(T)x_(k), such that e_(k)=a_(k)−h_(k)^(T)x_(k); and means for iteratively updating the set of weighting filter coefficients as ${h_{k + 1} = h_{k} + \alpha\frac{e_{k}^{3}x_{k}}{\left\| x_{k} \right\|^{4}}},$ where α is a fixed positive number step-size.
 2. The least mean fourth adaptive filter circuit as recited in claim 1, wherein the desired output a_(k) at the time k is defined as a_(k)=g^(T)x_(k)+b_(k), wherein g is a vector composed of plant parameters and b_(k) represents plant noise.
 3. The least mean fourth adaptive filter circuit as recited in claim 2, wherein the fixed positive number step-size α is selected such that ${0 < \alpha < \frac{2}{\left\| v_{k} \right\|^{2}}},$ wherein v_(k) is a weight deviation vector at the time k, given by v_(k)≡h_(k)−g.
 4. A signal processor, comprising a processor and software means for processing a signal, the software means being executable by the processor, the software means including: means for processing the signal through a finite impulse response filter having a desired output a_(k) at a time k, an impulse response defined by a set of weighting filter coefficients h_(k), and an input signal defined by a regressor input signal vector at the time k, x_(k); means for calculating an error signal e_(k) at the time k as a difference between the desired output a_(k) at the time k and an estimated signal given by h_(k)^(T)x_(k), such that e_(k)=a_(k)−h_(k)^(T)x_(k); and means for iteratively updating the set of weighting filter coefficients as ${h_{k + 1} = h_{k} + \alpha\frac{e_{k}^{3}x_{k}}{\left\| x_{k} \right\|^{4}}},$ where α is a fixed positive number step-size.